What is an Autocomplete System?

Search Autocomplete is a widely used feature in modern applications like Google, Amazon, and YouTube. It enhances user experience by providing real-time suggestions based on partial input, helping users complete queries faster and discover popular or relevant search terms.

As the user types a query character by character, the system should return the top N suggestions that match the current prefix. For example, typing "app" might yield results like "apple", "app store", or "application".

These suggestions are typically ranked by relevance, popularity, frequency, or recency.
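As a rough baseline for the behavior described above, the sketch below scans every stored word, keeps those that match the prefix, and returns the top N by frequency. The function name `naive_suggestions`, the sample query log, and the alphabetical tie-break are assumptions made for illustration only.

```python
from collections import Counter
from typing import List


def naive_suggestions(counts: Counter, prefix: str, limit: int = 3) -> List[str]:
    """Scan every stored word and keep those that start with the prefix."""
    prefix = prefix.lower()
    matches = [word for word in counts if word.startswith(prefix)]
    # Rank by descending frequency, then alphabetically to break ties.
    matches.sort(key=lambda word: (-counts[word], word))
    return matches[:limit]


# Hypothetical query log; each repeated entry raises that word's frequency.
query_log = ["apple", "application", "apple", "apply", "banana", "apple", "apply"]
counts = Counter(word.lower() for word in query_log)

print(naive_suggestions(counts, "app"))  # ['apple', 'apply', 'application']
```

Every call touches all stored words, which is the cost a prefix-oriented data structure is meant to avoid.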

In this chapter, we will explore the low-level design of a Search Autocomplete system in detail.

Let's start by clarifying the requirements:

1. Clarifying Requirements

Before diving into the design, it's important to clarify how the autocomplete system is expected to behave. Asking targeted questions helps refine assumptions, define the scope, and align on core expectations for the system.

Discussion

Candidate: Should the autocomplete system be case-sensitive?

Interviewer: No, all inputs should be treated as lowercase. The system should be case-insensitive.

Candidate: Should the system only support English, or do we need to account for Unicode/multilingual input?

Interviewer: Let's assume only English characters for now.

Candidate: How should suggestions be ranked: alphabetically, by frequency of use, or both?

Interviewer: Good question. The system should support both strategies. The user of the system should be able to configure the ranking strategy.

Candidate: How many suggestions should be returned per prefix?

Interviewer: That should be configurable, perhaps a default of 10, but the system should allow specifying a custom limit.

Candidate: How does the system learn word frequencies? Are we tracking every time a word is added or searched?

Interviewer: Let's increment the frequency every time a word is inserted into the system.

Candidate: Can users input new words over time, or is the dictionary fixed at initialization?

Interviewer: Words can be added dynamically during runtime.

Candidate: Should we support deleting a word or updating its frequency?

Interviewer: No, we can skip delete and update functionality for now.

After gathering the details, we can summarize the key system requirements.

1.1 Functional Requirements

The system supports inserting words into an internal dictionary.
It returns suggestions when a user types a prefix.
Suggestions are ranked based on a configurable strategy (alphabetical or frequency-based).
The number of suggestions returned is configurable.
Frequency count is incremented each time a word is added.
Words and prefixes are treated case-insensitively.
The system can handle dynamic insertion of words at runtime.
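One way to honor the case-insensitivity and English-only requirements is to normalize every word and prefix before it reaches the dictionary. The sketch below is a minimal illustration; the helper name `normalize`, the decision to allow single spaces (so phrases such as "app store" pass), and the choice to return `None` for rejected input are all assumptions, not part of the stated requirements.

```python
import re
from typing import Optional

# Assumption: a valid entry is lowercase English letters, with single spaces
# allowed between words so that phrases such as "app store" are accepted.
_VALID_ENTRY = re.compile(r"[a-z]+(?: [a-z]+)*")


def normalize(text: str) -> Optional[str]:
    """Lowercase the input, collapse whitespace, and reject anything outside a-z and spaces."""
    cleaned = " ".join(text.lower().split())
    return cleaned if _VALID_ENTRY.fullmatch(cleaned) else None


print(normalize("  Apple "))     # apple
print(normalize("App   Store"))  # app store
print(normalize("café"))         # None
print(normalize(""))             # None
```

Applying the same function to inserted words and to query prefixes keeps the two sides consistent.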

1.2 Non-Functional Requirements

The system should be fast and optimized for real-time suggestions.
It should be modular and follow object-oriented design principles.
The design should be easily extensible to add new ranking strategies or support new languages in the future.
The system can assume in-memory data storage and doesn't need persistence.

After the requirements are clear, the next step is to identify the core entities that will form the foundation of our design.

2. Identifying Core Entities

Core entities are the fundamental building blocks of our system. We identify them by analyzing the functional requirements and highlighting the key nouns and responsibilities that naturally map to object-oriented abstractions such as classes, enums, or interfaces.

Let's walk through the functional requirements and extract the relevant entities:

1. The system must store a large dictionary of words and their usage frequencies efficiently.

This requirement is central to the system's performance. An unsorted list or a hash map keyed by whole words would be inefficient for prefix-based searches, since each query would have to scan every stored word. This points to a specialized data structure, the Trie.

Each node in the Trie represents a character, forming a tree-like structure of prefixes. Therefore, we need a TrieNode entity to be the basic building block and a Trie class to manage the overall structure, including word insertions and prefix searches. The TrieNode will also need to store the frequency of each completed word.
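To make the prefix-tree idea concrete before the class-based design, here is a minimal sketch that uses nested dictionaries instead of dedicated node objects. The `END` marker key and the function names `insert_word` and `find_prefix_node` are hypothetical, and the sketch assumes words contain only lowercase English letters (so the marker cannot collide with a character).

```python
from typing import Optional

END = "$"  # hypothetical marker key that stores a completed word's frequency


def insert_word(root: dict, word: str) -> None:
    """Create one nested dict per character and bump the word's frequency."""
    node = root
    for ch in word:
        node = node.setdefault(ch, {})
    node[END] = node.get(END, 0) + 1


def find_prefix_node(root: dict, prefix: str) -> Optional[dict]:
    """Follow the prefix one character at a time; the cost depends on len(prefix)."""
    node = root
    for ch in prefix:
        node = node.get(ch)
        if node is None:
            return None
    return node


root: dict = {}
for word in ["car", "card", "care", "cat", "car"]:
    insert_word(root, word)

node = find_prefix_node(root, "car")
print(node[END])                               # 2
print(sorted(k for k in node if k != END))     # ['d', 'e']
print(find_prefix_node(root, "dog"))           # None
```

Locating the prefix node takes one step per prefix character, regardless of how many words are stored; only the words below that node need to be visited afterwards.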

2. Given a prefix, the system must return a list of potential full-word suggestions.

This involves two steps: finding all words that start with the given prefix and then packaging them for further processing. The Trie class is responsible for traversing from the prefix's end node to find all valid descendant words. To hold the results of this traversal, which includes both the word and its associated data (like frequency), we need a simple data-transfer object. This leads to the Suggestion entity, which encapsulates a word and its ranking weight.

TrieNode: The fundamental building block of the Trie. Each node represents a character, contains references to its children, a flag to indicate if it marks the end of a word, and a counter for the word's frequency.
Trie: The core data structure that organizes TrieNodes to store words for efficient prefix-based lookups. It provides methods to insert words and to search for all words starting with a given prefix.
Suggestion: A simple data object used to pair a potential word with its ranking weight (e.g., frequency). This object is used internally to facilitate the ranking process.
AutocompleteSystem: The main public-facing class that acts as a facade for the entire system. It integrates the Trie and a selected RankingStrategy to provide simple methods for adding words and retrieving ranked suggestions.

These core entities define the key abstractions of the autocomplete system and will guide the structure of our low-level design and class diagrams.

3. Class Design

3.1 Class Definitions

Data Classes

These classes primarily act as data containers with minimal behavior.

TrieNode

Represents a single node in the Trie data structure. It's the fundamental unit for storing character-level information and word metadata.

Attributes:

children: A dictionary (Map or Dictionary) mapping a char to its corresponding child TrieNode.
isEndOfWord: A boolean flag indicating whether this node represents the end of a complete word.
frequency: An integer that counts how many times the word ending at this node has been inserted.
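The three attributes above map naturally onto a small data class. The following is a sketch only, assuming Python's `dataclasses` module is acceptable for this role; the name `TrieNodeSketch` is hypothetical and chosen to avoid confusion with the TrieNode class in the implementation section.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class TrieNodeSketch:
    # default_factory gives every node its own dict instead of a shared one.
    children: Dict[str, "TrieNodeSketch"] = field(default_factory=dict)
    is_end_of_word: bool = False
    frequency: int = 0


first = TrieNodeSketch()
second = TrieNodeSketch()
first.children["a"] = TrieNodeSketch(is_end_of_word=True, frequency=1)

print(len(first.children), len(second.children))  # 1 0
print(first.children["a"].frequency)              # 1
```

The string annotation `"TrieNodeSketch"` is a forward reference, needed because the class name is not yet bound while its own body is being evaluated.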

Suggestion

A simple data transfer object (DTO) that encapsulates a word and its associated ranking weight, making it easy to pass around between the Trie collection logic and the ranking strategy.

Attributes:

word: A string representing the suggested word.
weight: An int representing the value used for ranking (e.g., frequency).
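Because a Suggestion is only passed from collection to ranking, it can be modeled as an immutable value object. This is one possible sketch, assuming a frozen dataclass is acceptable; `SuggestionSketch` is a hypothetical name, and the implementation later in this chapter uses a plain class with getters instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuggestionSketch:
    word: str
    weight: int  # e.g. how many times the word was inserted


suggestion = SuggestionSketch("apple", 3)
print(suggestion)                                  # SuggestionSketch(word='apple', weight=3)
print(suggestion == SuggestionSketch("apple", 3))  # True
```

Freezing the object means a ranking strategy cannot accidentally alter a weight while sorting.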

Core Classes

Trie

Attributes:

root: The root TrieNode of the Trie.

Methods:

Insert(string word): Adds a word to the Trie, creating nodes for each character and updating the frequency count.
SearchPrefix(string prefix): Traverses the Trie to find the node corresponding to the end of a given prefix. Returns null if the prefix does not exist.
CollectSuggestions(TrieNode startNode, string prefix): Performs a depth-first search (DFS) from a given startNode to gather all complete words and returns them as a list of Suggestion objects.
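The depth-first collection step can also be written without recursion, which sidesteps Python's recursion limit for very long words. The sketch below is an illustrative alternative, not the chapter's implementation; `Node`, `insert`, `find_node`, and `collect_iterative` are hypothetical names, and results are returned as `(word, frequency)` tuples for brevity.

```python
from typing import Dict, List, Optional, Tuple


class Node:
    """Hypothetical stand-in for TrieNode with the attributes listed earlier."""

    def __init__(self) -> None:
        self.children: Dict[str, "Node"] = {}
        self.is_end_of_word = False
        self.frequency = 0


def insert(root: Node, word: str) -> None:
    node = root
    for ch in word:
        if ch not in node.children:
            node.children[ch] = Node()
        node = node.children[ch]
    node.is_end_of_word = True
    node.frequency += 1


def find_node(root: Node, prefix: str) -> Optional[Node]:
    node: Optional[Node] = root
    for ch in prefix:
        node = node.children.get(ch)
        if node is None:
            return None
    return node


def collect_iterative(start: Node, prefix: str) -> List[Tuple[str, int]]:
    """DFS with an explicit stack; each entry pairs a node with the text that leads to it."""
    results: List[Tuple[str, int]] = []
    stack = [(start, prefix)]
    while stack:
        node, text = stack.pop()
        if node.is_end_of_word:
            results.append((text, node.frequency))
        for ch, child in node.children.items():
            stack.append((child, text + ch))
    return results


root = Node()
for word in ["car", "cart", "cat", "car", "dog"]:
    insert(root, word)

start = find_node(root, "ca")
print(sorted(collect_iterative(start, "ca")))  # [('car', 2), ('cart', 1), ('cat', 1)]
```

The visiting order differs from the recursive version, so the output is sorted here; in the full system the ranking strategy decides the final order anyway.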

AutocompleteSystem

Attributes:

trie: An instance of the Trie class to store all words.
rankingStrategy: An instance of RankingStrategy to define how suggestions are ranked.
maxSuggestions: An integer specifying the maximum number of suggestions to return.

Methods:

AddWord(string word) / AddWords(List<string> words): Public methods to populate the Trie.
GetSuggestions(string prefix): The primary method for clients. It orchestrates the process of searching the Trie, collecting suggestions, ranking them using the rankingStrategy, and returning the top results.
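When only the top few results are needed, the "rank, then truncate" step does not have to sort every candidate. As a hedged alternative, the sketch below selects the best k entries with the standard-library `heapq.nsmallest`; the function name `top_k_by_frequency`, the tuple representation, and the alphabetical tie-break are assumptions for illustration.

```python
import heapq
from typing import List, Tuple


def top_k_by_frequency(candidates: List[Tuple[str, int]], k: int) -> List[str]:
    """Return the k words with the highest weight; ties go to the alphabetically earlier word."""
    # Negating the weight lets nsmallest pick the largest weights first.
    best = heapq.nsmallest(k, candidates, key=lambda pair: (-pair[1], pair[0]))
    return [word for word, _ in best]


candidates = [("car", 3), ("cat", 1), ("canada", 4), ("candy", 1), ("cart", 1)]
print(top_k_by_frequency(candidates, 3))  # ['canada', 'car', 'candy']
```

For small candidate lists a full sort is just as good; the heap-based selection mainly helps when a short prefix matches a very large number of words.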

3.2 Class Relationships

The relationships define how our classes interact.

Composition ("has-a")

AutocompleteSystem has a Trie and has a RankingStrategy. The AutocompleteSystem manages the lifecycle and coordinates the actions of these components.

Trie is composed of TrieNode objects. The Trie class is responsible for creating and linking TrieNodes to form the prefix tree structure.

Dependency ("uses-a")

AutocompleteSystem uses Suggestion objects as an intermediary data structure.

3.3 Key Design Patterns

Strategy Pattern

The RankingStrategy interface and its concrete implementations (FrequencyBasedRanking, AlphabeticalRanking) are a clear application of the Strategy Pattern. This decouples the core suggestion-finding logic within AutocompleteSystem from the ranking algorithm.

It allows us to change the ranking behavior at runtime or easily add new ranking methods (e.g., by recency, by location) without modifying the AutocompleteSystem class.
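To illustrate that extensibility, the sketch below adds a new strategy without touching the code that consumes it. `ShortestFirstRanking` and the helper `top_words` are hypothetical, and the compact `Suggestion` and `RankingStrategy` definitions are simplified stand-ins included only so the example runs on its own.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Suggestion:
    word: str
    weight: int


class RankingStrategy(ABC):
    @abstractmethod
    def rank(self, suggestions: List[Suggestion]) -> List[Suggestion]:
        ...


class FrequencyBasedRanking(RankingStrategy):
    def rank(self, suggestions: List[Suggestion]) -> List[Suggestion]:
        return sorted(suggestions, key=lambda s: s.weight, reverse=True)


class ShortestFirstRanking(RankingStrategy):
    """Hypothetical new strategy: shorter words first, alphabetical among equal lengths."""

    def rank(self, suggestions: List[Suggestion]) -> List[Suggestion]:
        return sorted(suggestions, key=lambda s: (len(s.word), s.word))


def top_words(strategy: RankingStrategy, suggestions: List[Suggestion], limit: int) -> List[str]:
    # The caller depends only on the RankingStrategy interface.
    return [s.word for s in strategy.rank(suggestions)[:limit]]


suggestions = [Suggestion("cartoon", 1), Suggestion("cat", 1), Suggestion("car", 3)]
print(top_words(FrequencyBasedRanking(), suggestions, 2))  # ['car', 'cartoon']
print(top_words(ShortestFirstRanking(), suggestions, 2))   # ['car', 'cat']
```

Swapping the strategy object changes the result while `top_words` stays untouched.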

Builder Pattern

The AutocompleteSystemBuilder class implements the Builder Pattern. This pattern simplifies the creation of the AutocompleteSystem object, which has multiple configuration parameters.

It provides a fluent, readable API for constructing the object step-by-step and separates the complex construction logic from the final object representation.
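A builder is also a convenient place to validate configuration once, before the object exists. The sketch below shows that idea under stated assumptions: `AutocompleteConfig` and `AutocompleteConfigBuilder` are hypothetical names, rankings are represented as plain strings to keep the example self-contained, and the validation rules are illustrative rather than part of the original design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AutocompleteConfig:
    """Hypothetical stand-in for the object the builder produces."""
    ranking: str
    max_suggestions: int


class AutocompleteConfigBuilder:
    """Fluent builder with defaults; validation runs once, in build()."""

    _KNOWN_RANKINGS = ("frequency", "alphabetical")

    def __init__(self) -> None:
        self._ranking = "frequency"   # default strategy
        self._max_suggestions = 10    # default limit

    def with_ranking(self, ranking: str) -> "AutocompleteConfigBuilder":
        self._ranking = ranking
        return self  # returning self is what enables method chaining

    def with_max_suggestions(self, limit: int) -> "AutocompleteConfigBuilder":
        self._max_suggestions = limit
        return self

    def build(self) -> AutocompleteConfig:
        if self._ranking not in self._KNOWN_RANKINGS:
            raise ValueError(f"unknown ranking: {self._ranking!r}")
        if self._max_suggestions <= 0:
            raise ValueError("max_suggestions must be positive")
        return AutocompleteConfig(self._ranking, self._max_suggestions)


config = AutocompleteConfigBuilder().with_max_suggestions(5).build()
print(config)  # AutocompleteConfig(ranking='frequency', max_suggestions=5)
```

Settings that are never specified fall back to the defaults set in the constructor, which is the main reason the builder keeps state of its own.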

Facade Pattern

The AutocompleteSystem class itself serves as a Facade. It provides a simple, high-level interface (AddWord, GetSuggestions) to a more complex underlying subsystem composed of the Trie, Suggestion objects, and the RankingStrategy. Clients interact with this simple facade, remaining unaware of the intricate logic of Trie traversal and suggestion ranking.

3.4 Full Class Diagram

4. Implementation

4.1 TrieNode

from typing import Dict


class TrieNode:
    """A single character position in the Trie."""

    def __init__(self):
        # The quoted "TrieNode" is a forward reference: the class name is not
        # yet bound while its own body is being evaluated.
        self.children: Dict[str, "TrieNode"] = {}
        self.is_end_of_word: bool = False
        self.frequency: int = 0

    def get_children(self) -> Dict[str, "TrieNode"]:
        return self.children

    def is_end_of_word_check(self) -> bool:
        return self.is_end_of_word

    def set_end_of_word(self, end_of_word: bool):
        self.is_end_of_word = end_of_word

    def get_frequency(self) -> int:
        return self.frequency

    def increment_frequency(self):
        self.frequency += 1

4.2 Suggestion

class Suggestion:
    """Pairs a candidate word with the weight used for ranking."""

    def __init__(self, word: str, weight: int):
        self.word = word
        self.weight = weight

    def get_word(self) -> str:
        return self.word

    def get_weight(self) -> int:
        return self.weight

4.3 Trie

from typing import List, Optional

# TrieNode (4.1) and Suggestion (4.2) are assumed to be in scope.


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word: str):
        # Walk the word character by character, creating missing nodes.
        current = self.root
        for ch in word:
            if ch not in current.get_children():
                current.get_children()[ch] = TrieNode()
            current = current.get_children()[ch]
        # Mark the final node as a complete word and count this insertion.
        current.set_end_of_word(True)
        current.increment_frequency()

    def search_prefix(self, prefix: str) -> Optional[TrieNode]:
        # Return the node where the prefix ends, or None if it is absent.
        current = self.root
        for ch in prefix:
            node = current.get_children().get(ch)
            if node is None:
                return None
            current = node
        return current

    def collect_suggestions(self, start_node: TrieNode, prefix: str) -> List[Suggestion]:
        suggestions = []
        self._collect(start_node, prefix, suggestions)
        return suggestions

    def _collect(self, node: TrieNode, current_prefix: str, suggestions: List[Suggestion]):
        # Depth-first traversal: record every complete word below this node.
        if node.is_end_of_word_check():
            suggestions.append(Suggestion(current_prefix, node.get_frequency()))
        for ch, child in node.get_children().items():
            self._collect(child, current_prefix + ch, suggestions)

4.4 RankingStrategy

from abc import ABC, abstractmethod
from typing import List

# Suggestion (4.2) is assumed to be in scope.


class RankingStrategy(ABC):
    @abstractmethod
    def rank(self, suggestions: List[Suggestion]) -> List[Suggestion]:
        pass


class AlphabeticalRanking(RankingStrategy):
    def rank(self, suggestions: List[Suggestion]) -> List[Suggestion]:
        return sorted(suggestions, key=lambda s: s.get_word())


class FrequencyBasedRanking(RankingStrategy):
    def rank(self, suggestions: List[Suggestion]) -> List[Suggestion]:
        # sorted() is stable, so words with equal weight keep their collection order.
        return sorted(suggestions, key=lambda s: s.get_weight(), reverse=True)

4.5 AutocompleteSystem

from typing import List

# Trie (4.3) and RankingStrategy (4.4) are assumed to be in scope.


class AutocompleteSystem:
    def __init__(self, ranking_strategy: RankingStrategy, max_suggestions: int):
        self.trie = Trie()
        self.ranking_strategy = ranking_strategy
        self.max_suggestions = max_suggestions

    def add_word(self, word: str):
        # Lowercasing on insert keeps the dictionary case-insensitive.
        self.trie.insert(word.lower())

    def add_words(self, words: List[str]):
        for word in words:
            self.add_word(word)

    def get_suggestions(self, prefix: str) -> List[str]:
        normalized = prefix.lower()
        prefix_node = self.trie.search_prefix(normalized)
        if prefix_node is None:
            return []
        raw_suggestions = self.trie.collect_suggestions(prefix_node, normalized)
        ranked_suggestions = self.ranking_strategy.rank(raw_suggestions)
        return [s.get_word() for s in ranked_suggestions[:self.max_suggestions]]

4.6 AutocompleteSystemBuilder

class AutocompleteSystemBuilder:
    def __init__(self):
        self.ranking_strategy = FrequencyBasedRanking()  # Default strategy
        self.max_suggestions = 10  # Default limit

    def with_ranking_strategy(self, strategy: RankingStrategy):
        self.ranking_strategy = strategy
        return self

    def with_max_suggestions(self, max_suggestions: int):
        self.max_suggestions = max_suggestions
        return self

    def build(self) -> AutocompleteSystem:
        return AutocompleteSystem(self.ranking_strategy, self.max_suggestions)

4.7 AutocompleteDemo

def main():
    print("----------- SCENARIO 1: Frequency-based Ranking -----------")
    # 1. Build the system with the default frequency-based strategy
    system_by_frequency = (AutocompleteSystemBuilder()
                           .with_max_suggestions(5)
                           .with_ranking_strategy(FrequencyBasedRanking())
                           .build())

    # 2. Feed data into the system
    # 'canada' is added most frequently, followed by 'car'
    dictionary = [
        "car", "cat", "cart", "cartoon", "canada", "candy",
        "car", "canada", "canada", "car", "canada", "canopy", "captain"
    ]
    system_by_frequency.add_words(dictionary)

    # 3. Get suggestions for a prefix
    prefix1 = "ca"
    suggestions1 = system_by_frequency.get_suggestions(prefix1)
    print(f"Suggestions for '{prefix1}': {suggestions1}")

    prefix2 = "car"
    suggestions2 = system_by_frequency.get_suggestions(prefix2)
    print(f"Suggestions for '{prefix2}': {suggestions2}")

    print("\n----------- SCENARIO 2: Alphabetical Ranking -----------")
    # 1. Build a new system with the alphabetical strategy
    system_alphabetical = (AutocompleteSystemBuilder()
                           .with_ranking_strategy(AlphabeticalRanking())
                           .build())

    # 2. Feed the same data
    system_alphabetical.add_words(dictionary)

    # 3. Get suggestions for the same prefix
    suggestions3 = system_alphabetical.get_suggestions(prefix1)
    print(f"Suggestions for '{prefix1}' (alphabetical): {suggestions3}")


if __name__ == "__main__":
    main()

Comments (1)

Rishabh Raj, 15 days ago

In AutocompleteSystemBuilder there is no need for a constructor. Also, the builder implementation does not seem to follow the Builder pattern as conventionally written: there is no nesting of classes like in a conventional builder.

Design Search Autocomplete System

Ashish Pratap Singh
